Abstract

We  present  a  new  implementation  of  TCP  that  is  better  suited  to 
today’s  Internet  than  TCP  Reno  or  Tahoe.  Our  implementation  of 
TCP,  which  we  call  TCP  Santa  Cruz,  is  designed  to  work  with  path 
asymmetries,  out-of-order  packet  delivery,  and  networks  with  lossy 
links,  limited  bandwidth  and  dynamic  changes  in  delay.  The  new 
congestion-control  and  error-recovery  mechanisms  in  TCP  Santa 
Cruz  are  based  on:  using  estimates  of  delay  along  the  forward 
path, rather than the round-trip delay; reaching a target operating
point for the number of packets in the bottleneck of the connection,
without congesting the network; and making resilient use of
any acknowledgments received over a window, rather than increasing
the congestion window by counting the number of returned
acknowledgments. We compare TCP Santa Cruz with the Reno and
Vegas implementations using the ns2 simulator. The simulation
experiments show that TCP Santa Cruz achieves significantly higher
throughput,  smaller  delays,  and  smaller  delay  variances  than  Reno 
and  Vegas.  TCP  Santa  Cruz  is  also  shown  to  prevent  the  swings  in 
the  size  of  the  congestion  window  that  typify  TCP  Reno  and  Tahoe 
traffic,  and  to  determine  the  direction  of  congestion  in  the  network 
and  isolate  the  forward  throughput  from  events  on  the  reverse  path. 


1  Introduction 

Reliable  end-to-end  transmission  of  data  is  a  much  needed  service 
for  many  of  today’s  applications  running  over  the  Internet  (e.g., 
WWW,  file  transfers,  electronic  mail,  remote  login),  which  makes 
TCP  an  essential  component  of  today’s  Internet.  However,  it  has 
been  widely  demonstrated  that  TCP  exhibits  poor  performance  over 
wireless networks [1, 17] and networks that have even small degrees
of path asymmetries [11]. The performance problems of current
TCP implementations (Reno and Tahoe) over internets of heterogeneous
transmission media stem from inherent limitations in
the  error  recovery  and  congestion-control  mechanisms  they  use. 

Traditional  Reno  and  Tahoe  TCP  implementations  perform  one 
round-trip time estimate for each window of outstanding data. In
addition, Karn's algorithm [9] dictates that, after a packet loss,
round-trip time (RTT) estimates for a retransmitted packet cannot
be used in the TCP RTT estimation. The unfortunate side effect of
this approach is that no estimates are made during periods of congestion,
precisely the time when they would be the most useful.

* This work was supported in part at UCSC by the Office of Naval Research (ONR)
under Grant N00014-99-1-0167.


Without  accurate  RTT  estimates  during  congestion,  a  TCP  sender 
may  retransmit  prematurely  or  after  undue  delays.  Because  all  prior 
approaches  are  unable  to  perform  RTT  estimates  during  periods  of 
congestion,  a  timer-backoff  strategy  (in  which  the  timeout  value  is 
essentially  doubled  after  every  timeout  and  retransmission)  is  used 
to  avoid  premature  retransmissions. 

Reno and Tahoe TCP implementations and many proposed alternative
solutions [14, 15, 20] use packet loss as a primary indication
of congestion; a TCP sender increases its window size until
packet  losses  occur  along  the  path  to  the  TCP  receiver.  This  poses 
a  major  problem  in  wireless  networks,  where  bandwidth  is  a  very 
scarce  resource.  Furthermore,  the  periodic  and  wide  fluctuation 
of  window  size  typical  of  Reno  and  Tahoe  TCP  implementations 
causes  high  fluctuations  in  delay  and  therefore  high  delay  variance 
at the endpoints of the connection, a side effect that is unacceptable
for delay-sensitive applications.

Today’s  applications  over  the  Internet  are  likely  to  operate  over 
paths that either exhibit a high degree of asymmetry or appear
asymmetric due to significant load differences between the forward
and reverse data paths. Under such conditions, controlling
congestion  based  on  acknowledgment  (ACK)  counting  as  in  TCP 
Reno  and  Tahoe  results  in  significant  underutilization  of  the  higher 
capacity  forward  link  due  to  loss  of  ACKs  on  the  slower  reverse 
link  [11].  ACK  losses  also  lead  to  very  bursty  data  traffic  on  the 
forward  path.  For  this  reason,  a  better  congestion  control  algorithm 
is  needed  that  is  resilient  to  ACK  losses. 

In this paper, we propose TCP Santa Cruz, a new implementation
of TCP that can be implemented as a TCP option by utilizing
the extra 40 bytes available in the options field of the TCP header.
TCP Santa Cruz not only detects the initial stages of congestion
but can also identify the direction of congestion, i.e., it determines
if congestion is developing in the forward path and then isolates the
forward  throughput  from  events  such  as  congestion  on  the  reverse 
path.  The  direction  of  congestion  is  determined  by  estimating  the 
relative  delay  that  one  packet  experiences  with  respect  to  another; 
this relative delay is the foundation of our congestion control algorithm.
Our approach is significantly different from rate-controlled
congestion control approaches, e.g., TCP Vegas [2], as well as those
that use an increasing round-trip time (RTT) estimate as the primary
indication of congestion [22, 18, 21], in that TCP Santa Cruz
does  not  use  RTT  estimates  in  any  way  for  congestion  control.  This 
represents  a  fundamental  improvement  over  the  latter  approaches, 
because RTT measurements do not permit the sender to differentiate
between delay variations due to increases or decreases in the
forward or reverse paths of a connection.

TCP  Santa  Cruz  provides  a  better  error-recovery  strategy  than 
Reno and Tahoe do by providing a mechanism to perform RTT estimates
for every packet transmitted, including retransmissions. This
eliminates  the  need  for  Karn’s  algorithm  and  does  not  require  any 
timer-backoff  strategies,  which  can  lead  to  long  idle  periods  on  the 
links.  In  addition,  when  multiple  segments  are  lost  per  window  we 
provide  a  mechanism  to  perform  retransmissions  without  waiting 
for  a  TCP  timeout. 

Section 2 discusses prior work on improving TCP and
compares those approaches to ours. Section 3 describes the algorithms
that form our proposed TCP implementation and shows
examples of their operation. Section 4 shows via simulation the
performance improvements obtained with TCP Santa Cruz over the
Reno and Vegas TCP implementations. Finally, Section 5 summarizes
our results.

2  Previous  Work 

Congestion  control  for  TCP  is  an  area  of  active  research;  solutions 
to congestion control for TCP address the problem either at the intermediate
routers in the network [8, 13, 6] or at the endpoints of
the connection [2, 7, 20, 21, 22].

Router-based support for TCP congestion control can be provided
through RED gateways [6], a solution in which packets are
dropped  in  a  fair  manner  (based  upon  probabilities)  once  the  router 
buffer  reaches  a  predetermined  size.  As  an  alternative  to  dropping 
packets,  an  Explicit  Congestion  Notification  (ECN)  [8]  bit  can  be 
set in the packet header, prompting the source to slow down. Current
TCP implementations do not support the ECN method. Kalampoukas
et al. [13] propose an approach that prevents TCP sources
from growing their congestion window beyond the bandwidth-delay
product of the network by allowing the routers to modify the
receiver’s  advertised  window  field  of  the  TCP  header  in  such  a  way 
that  TCP  does  not  overrun  the  intermediate  buffers  in  the  network. 

End-to-end  congestion  control  approaches  can  be  separated  into 
three categories: rate-control, packet round-trip time (RTT) measurements,
and modification of the source or receiver to return additional
information beyond what is specified in the standard TCP
header [20]. A problem with rate-control and relying upon RTT estimates
is that variations of congestion along the reverse path cannot
be identified and separated from events on the forward path. Therefore,
an increase in RTT due to reverse-path congestion or even
link  asymmetry  will  affect  the  performance  and  accuracy  of  these 
algorithms.  In  the  case  of  RTT  monitoring,  the  window  size  could 
be  decreased  (due  to  an  increased  RTT  measurement)  resulting  in 
decreased throughput; in the case of rate-based algorithms, the window
could be increased in order to bump up throughput, resulting
in  increased  congestion  along  the  forward  path. 

Wang  and  Crowcroft’s  DUAL  algorithm  [21]  uses  a  congestion 
control scheme that interprets RTT variations as indications of delay
through the network. The algorithm keeps track of the minimum
and  maximum  delay  observed  to  estimate  the  maximum  queue  size 
in  the  bottleneck  routers  and  keep  the  window  size  such  that  the 
queues  do  not  fill  and  thereby  cause  packet  loss.  An  adjustment 
of ±1/8 cwnd is made to the congestion window every other round-trip
time whenever the observed RTT deviates from the mean of the
highest  and  lowest  RTT  ever  observed.  RFC  1323  [7]  uses  the  TCP 
Options  to  include  a  timestamp  in  every  data  packet  from  sender 
to  receiver  to  obtain  a  more  accurate  RTT  estimate.  The  receiver 
echoes this timestamp in each ACK packet and the round-trip time
is calculated with a single subtraction. This approach encounters
problems  when  delayed  ACKs  are  used,  because  it  is  then  unclear 
to  which  packet  the  timestamp  belongs.  RFC  1323  suggests  that 
the  receiver  return  the  earliest  timestamp  so  that  the  RTT  estimate 
takes  into  account  the  delayed  ACKs,  as  segment  loss  is  assumed 
to  be  a  sign  of  congestion,  and  the  timestamp  returned  is  from  the 
sequence  number  which  last  advanced  the  window.  When  a  hole 
is  filled  in  the  sequence  space,  the  receiver  returns  the  timestamp 
from the segment which filled the hole. The downside of this approach
is  that  it  cannot  provide  accurate  timestamps  when  segments  are 
lost. 

Two  notable  rate-control  approaches  are  the  Tri-S  [22]  scheme 
and  TCP  Vegas  [2].  Wang  and  Crowcroft's  Tri-S  algorithm  [22] 
computes  the  achieved  throughput  by  measuring  the  RTT  for  a 
given  window  size  (which  represents  the  amount  of  outstanding 
data in the network) and comparing the throughput when the window
is increased by one segment. TCP Vegas has three main components:
a retransmission mechanism, a congestion avoidance mechanism,
and a modified slow-start algorithm. TCP Vegas provides
faster retransmissions by examining a timestamp upon receipt of
a duplicate ACK. The congestion avoidance mechanism is based
upon a once per round-trip time comparison between the ideal (expected)
throughput and the actual throughput. The ideal throughput
is based upon the best RTT ever observed and the observed throughput
is the throughput observed over an RTT period. The goal is to
keep the actual throughput between two threshold values, α and β,
which represent too little and too much data in flight, respectively.

Because  we  are  interested  in  solutions  to  TCP's  performance 
problems  applicable  over  different  types  of  networks  and  links,  our 
approach focuses on end-to-end solutions. However, our work is
closely related to a method of bandwidth probing introduced by
Keshav [10]. In this approach, two back-to-back packets are transmitted
through the network and the interarrival time of their ACK
packets is measured to determine the bottleneck service rate (the
conjecture is that the ACK spacing preserves the data packet spacing).
This rate is then used to keep the bottleneck queue at a predetermined
value. For the scheme to work, it is assumed that the
routers are employing round-robin or some other fair service discipline.
The approach does not work over heterogeneous networks,
where the capacity of the reverse path could be orders of magnitude
lower than that of the forward path, because the data packet spacing
is not preserved by the ACK packets. In addition, a receiver could
employ  a  delayed  ACK  strategy,  which  is  common  in  many  TCP 
implementations,  and  congestion  on  the  reverse  path  can  interfere 
with  ACK  spacing  and  invalidate  the  measurements  made  by  the 
algorithm. 

3  TCP  Santa  Cruz  -  Protocol  Description 

TCP Santa Cruz provides improvement over TCP Reno in two major
areas: congestion control and error recovery. The congestion
control algorithm introduced in TCP Santa Cruz determines when
congestion exists or is developing on the forward data path, a condition
that cannot be detected by a round-trip time estimate. This
type  of  monitoring  permits  the  detection  of  the  incipient  stages  of 
congestion,  allowing  the  congestion  window  to  increase  or  decrease 
in  response  to  early  warning  signs.  In  addition,  TCP  Santa  Cruz 
uses  relative  delay  calculations  to  isolate  the  forward  throughput 
from  any  congestion  that  might  be  present  along  the  reverse  path. 


The  error  recovery  methods  introduced  in  TCP  Santa  Cruz  perform 
timely  and  efficient  early  retransmissions  of  lost  packets,  eliminate 
unnecessary  retransmissions  for  correctly  received  packets  when 
multiple losses occur within a window of data, and provide RTT estimates
during periods of congestion and retransmission (i.e., eliminate
the need for Karn’s algorithm). The rest of this section describes
these mechanisms.

3.1  Congestion  Control 

3.1.1 Eliminating RTT ambiguity using relative delays

Round-trip time measurements alone are not sufficient for determining
whether congestion exists along the data path. Figure 1
shows an example of the ambiguity involved when only RTT measurements
are considered. Congestion is indicated by a queue along
the transmission path. The example shows the transmission of two
data packets and the returning ACKs from the receiver. If only
round-trip time (RTT) measurements were used, then measurements
$RTT_1 = 4$ and $RTT_2 = 5$ could lead to an incorrect conclusion
of  developing  congestion  in  the  forward  path  for  the  second  packet. 
The  true  cause  of  increased  RTT  for  the  second  packet  is  congestion 
along  the  reverse  path,  not  the  data  path.  Our  protocol  solves  this 
ambiguity  by  introducing  the  notion  of  the  relative  forward  delay. 

Figure 1. Example of RTT ambiguity


Relative delay is the increase or decrease in delay that packets
experience with respect to each other as they propagate through
the network. These measurements are the basis of our congestion
control algorithm. The sender calculates the relative delay from
a timestamp contained in every ACK packet that specifies the arrival
time of the packet at the destination. From the relative delay
measurement the sender can determine whether congestion is increasing
or decreasing in either the forward or reverse path of the
connection;  furthermore,  the  sender  can  make  this  determination 
for  every  ACK  packet  it  receives.  This  is  impossible  to  accomplish 
using  RTT  measurements. 

Figure 2 shows the transfer of two sequential packets transmitted
from a source to a receiver and labeled #1 and #2. The sender
maintains  a  table  with  the  following  two  times  for  every  packet: 
(a)  the  transmission  time  of  the  data  packet  at  the  source,  and  (b) 
the  arrival  time  of  the  data  packet  at  the  receiver,  as  reported  by 
the receiver in its ACK. From this information, the sender calculates
the following time intervals for any two data packets $i$ and $j$
(where $j > i$): $S_{j,i}$, the time interval between the transmission of
the packets; and $R_{j,i}$, the inter-arrival time of the data packets at
the receiver. From these values, the relative forward delay, $D^F_{j,i}$,
can be obtained:

$$D^F_{j,i} = R_{j,i} - S_{j,i} \qquad (1)$$

where $D^F_{j,i}$ represents the change in forward delay experienced by
packet $j$ with respect to packet $i$.
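
As a minimal sketch of Eq. 1, the sender could compute the relative forward delay from its own transmission timestamps and the arrival timestamps echoed in ACKs. The function and parameter names below are illustrative and not part of the protocol description. The sketch assumes that each timestamp pair is read from a single clock (sender or receiver), so only differences taken on the same clock are used and no clock synchronization is required.

```python
def relative_forward_delay(send_i, send_j, recv_i, recv_j):
    """Return D^F_{j,i} = R_{j,i} - S_{j,i} for packets i and j (j sent after i).

    send_i, send_j: transmission times read from the sender's clock.
    recv_i, recv_j: arrival times read from the receiver's clock,
                    as reported back in the ACKs.
    """
    s_ji = send_j - send_i  # S_{j,i}: spacing between the two transmissions
    r_ji = recv_j - recv_i  # R_{j,i}: inter-arrival time at the receiver
    return r_ji - s_ji


# Packets sent 1.0 s apart arrive 1.5 s apart: packet j picked up
# 0.5 s of extra forward delay relative to packet i.
d = relative_forward_delay(send_i=0.0, send_j=1.0, recv_i=100.0, recv_j=101.5)
```

Note that the receiver clock in the usage example is offset by roughly 100 s from the sender clock; the offset cancels because it appears in both arrival times.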


Figure  2.  Transmission  of  2  packets  and  corresponding 
relative  delay  measurements 

Figure  3  illustrates  how  we  detect  congestion  in  the  forward 
path. As illustrated in Figure 3(a), when $D^F_{j,i} = 0$ the two packets
experience the same amount of delay in the forward path. Figure 3(b)
shows that the first packet was delayed more than the second
packet whenever $D^F_{j,i} < 0$. Figure 3(c) shows that the second
packet has been delayed with respect to the first one when $D^F_{j,i} > 0$.
Finally, Figure 3(d) illustrates out-of-order arrival at the receiver. In
the latter case, the sender is able to determine the presence of multiple
paths to the destination by the timestamps returned from the
receiver.  Although  the  example  illustrates  measurements  based  on 
two  consecutive  packets,  TCP  Santa  Cruz  does  not  require  that  the 
calculations  be  performed  on  two  sequential  packets;  however,  the 
granularity  of  the  measurements  depends  on  the  ACK  policy  used. 

Figure 3. FORWARD PATH: (a) Equal delay (b) 1st
packet delayed (c) 2nd packet delayed (d) non-FIFO arrival

3.1.2 Congestion Control Algorithm

At any time during a connection, the network queues (and specifically
the bottleneck queue) are in one of three states: increasing
in size, decreasing in size, or maintaining their current state. The
state diagram of Figure 4 shows how the computation of the relative
forward delay, $D^F_{j,i}$, allows the determination of the change in
queue state. The goal in TCP Santa Cruz is to allow the network
queues (specifically the bottleneck queue) to grow to a desired size;
the specific algorithm to achieve this goal is described next.

Figure 4. Based upon the value of $D^F_{j,i}$, the bottleneck
queue can be currently filling, draining or maintaining.

The positive and negative relative delay values represent additional
or less queueing in the network, respectively. Summing the
relative delay measurements over a period of time provides an indication
of the level of queueing at the bottleneck. If the sum of
relative delays over an interval equals 0, it indicates that, with respect
to the beginning of the interval, no additional congestion or
queueing was present in the network at the end of the interval. Likewise,
if we sum relative delays from the beginning of a session, and
at  any  point  the  summation  equals  zero,  we  would  know  that  all 
of  the  data  for  the  session  are  contained  in  the  links  and  not  in  the 
network  queues  (assuming  the  queues  were  initially  empty). 
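
The reasoning behind summing relative delays can be made explicit with a short derivation. The symbols $s_k$, $r_k$ and $d_k$ (transmission time, arrival time, and one-way forward delay of packet $k$) are introduced here only for illustration, and the derivation assumes that the sender and receiver clocks differ by at most a constant offset, which cancels in the differences.

```latex
\begin{align*}
D^F_{j,i} &= R_{j,i} - S_{j,i}
           = (r_j - r_i) - (s_j - s_i)
           = (r_j - s_j) - (r_i - s_i)
           = d_j - d_i, \\
\sum_{k=1}^{n} D^F_{k,k-1} &= \sum_{k=1}^{n} \left( d_k - d_{k-1} \right)
           = d_n - d_0 .
\end{align*}
```

Under these assumptions the sum telescopes: a total of zero means that packet $n$ experienced the same forward delay as packet $0$, so if packet $0$ met empty queues, packet $n$ did as well.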

The  congestion  control  algorithm  of  TCP  Santa  Cruz  operates 
by  summing  the  relative  delays  from  the  beginning  of  a  session, 
and  then  updating  the  measurements  at  discrete  intervals,  with  each 
interval  equal  to  the  amount  of  time  to  transmit  a  windowful  of 
data and receive the corresponding ACKs. Since the unit of $D^F_{j,i}$
is time (seconds), the relative delay sum must then be translated
into an equivalent number of packets (queued at the bottleneck)
represented by this delay. In other words, the algorithm attempts to
maintain the following condition:

$$n_{t_i} = N_{op} = n_{t_{i-1}} + M_{W_{i-1}} \qquad (2)$$

where $n_{t_i}$ is the total number of packets queued at the bottleneck at
time $t_i$; $N_{op}$ is the operating point (the desired number of packets,
per session, to be queued at the bottleneck); $M_{W_{i-1}}$ is the additional
amount of queueing introduced over the previous window
$W_{i-1}$; and $n_{t_1} = M_{W_0}$.

The operating point: The operating point, $N_{op}$, is
the desired number of packets to reside in the bottleneck queue.
The value of $N_{op}$ should be greater than zero; the intuition behind
this decision is that an operating point equal to zero would lead
to underutilization of the available bandwidth because the queues
are always empty, i.e., no queueing is tolerated. Instead, the goal
is to allow a small amount of queueing so that a packet is always
available for forwarding over the bottleneck link. For example, if
we choose $N_{op}$ to be 1, then we expect a session to maintain 1
packet in the bottleneck queue, i.e., our ideal or desired congestion
window would be one packet above the bandwidth-delay product
(BWDP) of the network.


Translating the relative delay: The relative
delay gives an indication of the change in network queueing, but
provides no information on the actual number of packets corresponding
to this value. We translate the sum of relative delays into
the equivalent number of queued packets by first calculating the average
packet service time, pktS (sec/pkt), achieved by a session
over an interval. This rate is of course limited by the bottleneck
link.

Our  model  of  the  bottleneck  link,  depicted  in  Figure  5,  consists 
of two delay parameters: the queueing delay, $t_q$; and the output
service  time,  pktS  (the  amount  of  time  spent  servicing  a  packet). 
The  queueing  delay  is  variable  and  is  controlled  by  the  congestion 
control  algorithm  (by  changing  the  sender’s  congestion  window) 
and by network cross-traffic. The relative delay measurements provide
some feedback about this value. The output rate of a FIFO
queue will vary according to the number of sessions and the burstiness
of the arrivals from competing sessions. The packet service
time is calculated as

$$pktS = \frac{R}{\#\,\text{pkts received}} \qquad (3)$$

where $R$ is the difference in arrival time of any two packets as
calculated from the timestamps returned by the receiver. Because
$pktS$ changes during an interval, we calculate the average packet
service time, $\overline{pktS}$, over the interval. Finally, we translate the sum
of relative delays over the interval into the equivalent number of
packets represented by the sum by dividing the relative delay summation
by the average time to service a packet. This gives us the
number of packets represented by the delay over an interval:

$$M_{W_{i-1}} = \frac{\sum_{k} D^F_{k}}{\overline{pktS}} \qquad (4)$$

where $k$ are packet-pairs within window $W_{i-1}$. The total queueing
in the system at the end of the interval is determined by Eq. 2.
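
The translation from delay to packets can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function name is hypothetical, and the average packet service time is approximated as the span of receiver arrival timestamps in the window divided by the number of inter-arrival gaps, which is one simple way to average the per-pair service times.

```python
def queued_packets_from_delays(relative_delays, arrival_times):
    """Estimate the extra packets queued over one window.

    relative_delays: relative forward delays (seconds) for the
                     packet pairs observed in the window.
    arrival_times:   receiver arrival timestamps (seconds) of the
                     packets in the window, in arrival order.
    """
    if len(arrival_times) < 2:
        raise ValueError("need at least two arrival timestamps")
    elapsed = arrival_times[-1] - arrival_times[0]
    if elapsed <= 0:
        raise ValueError("arrival timestamps must span a positive interval")
    # Assumed averaging rule: seconds per packet over the whole window.
    avg_pkt_service = elapsed / (len(arrival_times) - 1)
    # Delay sum (seconds) divided by service time (sec/pkt) gives packets.
    return sum(relative_delays) / avg_pkt_service


# Four packets arriving 0.25 s apart with 0.25 s of accumulated
# relative delay correspond to one additional queued packet.
m = queued_packets_from_delays([0.0, 0.125, 0.125], [0.0, 0.25, 0.5, 0.75])
```

The timestamps in the usage line are chosen only to keep the arithmetic exact; real service times would be far smaller.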


Figure 5. Bottleneck Link: Delay consists of two parts:
$t_q$, the delay due to queueing and pktS, the packet service
time over the link.


Adjusting the window: The TCP Santa Cruz congestion
window is adjusted such that Eq. 2 is satisfied within a range
of $N_{op} \pm \delta$, where $\delta$ is some fraction of a packet. Adjustments are
made to the congestion window only at discrete intervals, i.e., in the
time taken to empty a windowful of data from the network. Over
this interval, $M_{W_{i-1}}$ is calculated and at the end of the interval it is
added to $n_{t_{i-1}}$. If the result falls within the range of $N_{op} \pm \delta$, the
congestion window is maintained at its current size. If, however,
$n_{t_i}$ falls below $N_{op} - \delta$, the system is not being pushed enough
and the window is increased linearly during the next interval. If
$n_{t_i}$ rises above $N_{op} + \delta$, then the system is being pushed too high
above the desired operating point, and the congestion window is
decreased linearly during the next interval.
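
A hedged sketch of this per-interval decision is shown below. The function name, the default values for the operating point and tolerance, and the choice to model "linear" change as one segment per interval are assumptions made for illustration; the text does not fix these details.

```python
def adjust_window(cwnd, n_prev, m_window, n_op=1.0, delta=0.5):
    """Apply one end-of-interval window adjustment.

    cwnd:     current congestion window, in segments.
    n_prev:   estimated packets queued at the end of the previous interval.
    m_window: additional queueing measured over the interval just ended.
    n_op:     operating point (desired packets in the bottleneck queue).
    delta:    tolerance around the operating point (hypothetical default).

    Returns (new_cwnd, n_now).
    """
    n_now = n_prev + m_window
    if n_now < n_op - delta:
        cwnd += 1                  # below target: grow by one segment
    elif n_now > n_op + delta:
        cwnd = max(1, cwnd - 1)    # above target: shrink, never below one
    # Otherwise the estimate is inside the band and cwnd is left unchanged.
    return cwnd, n_now


cwnd, n = adjust_window(cwnd=10, n_prev=0.0, m_window=0.2)  # grows to 11
```

Because the decision depends only on the delay-derived queue estimate, a lost ACK changes how many samples contribute to `m_window` but does not by itself change the direction of the adjustment.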

In  TCP  Reno,  the  congestion  control  algorithm  is  driven  by 
the  arrival  of  ACKs  at  the  source  (the  window  is  incremented  by 
1/cwnd  for  each  ACK  while  in  the  congestion  avoidance  phase). 
This  method  of  ACK  counting  causes  Reno  to  perform  poorly  when 
ACKs are lost [11]; unfortunately, ACK loss becomes a predominant
feature in TCP over asymmetric networks [11]. Given that the
congestion  control  algorithm  in  TCP  Santa  Cruz  makes  adjustments 
to  the  congestion  window  based  upon  delays  through  the  network 
and  not  on  the  arrival  of  ACKs  in  general,  the  algorithm  is  robust 
to  ACK  losses. 

Startup: Currently, the algorithm used by TCP Santa Cruz at
startup is the slow start algorithm used by TCP Reno with two
modifications. First, the initial congestion window, cwnd, is set to
two segments instead of one so that initial values for $D^F_{j,i}$ can be
calculated. Second, the algorithm may stop slow start before
cwnd exceeds ssthresh if any relative delay measurement or $n_{t_i}$
exceeds $N_{op}/2$. Once stopped, slow start begins again only if
a timeout occurs. During slow start, the congestion window
doubles every round-trip time, leading to an exponential growth in
the congestion window. One problem with slow start is that
such rapid growth often leads to congestion in the data path [2].
TCP-SC reduces this problem by ending slow start once any
queue buildup is detected.

3.2  Error  Recovery 

3.2.1  Improved  RTT  estimate 

TCP Santa Cruz provides better RTT estimates than traditional TCP
approaches by measuring the round-trip time (RTT) of every segment
transmitted for which an ACK is received, including retransmissions.
This eliminates the need for Karn’s algorithm [9] (in
which RTT measurements are not made for retransmissions) and
timer-backoff strategies (in which the timeout value is essentially
doubled after every timeout and retransmission). To accomplish
this, TCP Santa Cruz requires each returning ACK packet to indicate
the precise packet that caused the ACK to be generated, and
the sender must keep a timestamp for each transmitted or retransmitted
packet. Packets can be uniquely identified by specifying
both a sequence number and a retransmission copy number. For
example, the first transmission of packet 1 is specified as 1.1, the
second transmission is labeled 1.2, and so forth. In this way, the
sender can perform a new RTT estimate for every ACK it receives.
Therefore, ACKs from the receiver are logically a triplet consisting
of a cumulative ACK (indicating the sequence number of the
highest in-order packet received so far) and the two-element sequence
number of the packet generating the ACK (usually the most
recently received packet). For example, ACK (5.7.2) specifies a cumulative
ACK of 5, and that the ACK was generated by the second
transmission of a packet with sequence number 7. As with traditional
TCP implementations, we do not want the RTT estimate to
be updated too quickly; therefore, a weighted average is computed
for each new value received. We use the same algorithm as TCP
Tahoe and Reno; however, the computation is performed for every
ACK received, instead of once per RTT.
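
The bookkeeping described above can be sketched as follows. The class and method names are hypothetical, and the smoothing gain of 1/8 is an assumption taken from the conventional TCP smoothed-RTT filter, since the text states only that the Tahoe and Reno algorithm is reused. The mean-deviation term of that algorithm is omitted for brevity.

```python
class RttEstimator:
    """Per-ACK RTT estimation keyed by (sequence number, copy number)."""

    def __init__(self, gain=0.125):
        self.gain = gain   # assumed smoothing gain (conventional value 1/8)
        self.sent = {}     # (seq, copy) -> transmission time
        self.srtt = None   # smoothed RTT estimate, None until first sample

    def on_send(self, seq, copy, now):
        # Record every transmission, including retransmissions.
        self.sent[(seq, copy)] = now

    def on_ack(self, cum_ack, seq, copy, now):
        """Handle the logical triplet ACK (cum_ack.seq.copy)."""
        sent_at = self.sent.pop((seq, copy), None)
        if sent_at is not None:
            sample = now - sent_at
            if self.srtt is None:
                self.srtt = sample
            else:
                self.srtt += self.gain * (sample - self.srtt)
        # Drop records for packets covered by the cumulative ACK.
        for key in [k for k in self.sent if k[0] <= cum_ack]:
            del self.sent[key]
        return self.srtt


est = RttEstimator()
est.on_send(7, 1, now=0.0)    # first transmission of packet 7
est.on_send(7, 2, now=1.0)    # retransmission, copy number 2
est.on_ack(5, 7, 2, now=1.5)  # ACK (5.7.2): sample is 0.5 s, not 1.5 s
```

Because the ACK names the exact copy that triggered it, the sample for a retransmitted packet is unambiguous, which is the property that removes the need for Karn's algorithm.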

3.2.2  ACK  Window 

To assist in the identification and recovery of lost packets, the receiver
in TCP Santa Cruz returns an ACK Window to the sender to
indicate  any  holes  in  the  received  sequential  stream.  In  the  case  of 
multiple  losses  per  window,  the  ACK  Window  allows  TCP-SC  to 
retransmit  all  lost  packets  without  waiting  for  a  TCP  timeout. 

The ACK Window is similar to the bit vectors used in previous
protocols, such as NETBLT [4] and TCP-SACK [5, 15]. Unlike
TCP-SACK, our approach provides a new mechanism whereby the
receiver is able to report the status of every packet within the current
transmission window.¹ The ACK Window is maintained as a
vector in which each bit represents the receipt of a specified number
of bytes beyond the cumulative ACK. The receiver determines
an optimal granularity for bits in the vector and indicates this value
to the sender via a one-byte field in the header. A maximum of 19
bytes are available for the ACK window to meet the 40-byte limit
of the TCP option field in the TCP header. The granularity of the
bits in the window is bounded by the receiver’s advertised window
and the 18 bytes available for the ACK window; this can accommodate
a 64K window with each bit representing 450 bytes. Ideally, a
bit  in  the  vector  would  represent  the  MSS  of  the  connection,  or  the 
typical packet size. Note that this approach is meant for data-intensive
traffic; therefore, bits represent at least 50 bytes of data. If there are
no  holes  in  the  expected  sequential  stream  at  the  receiver,  then  the 
ACK  window  is  not  generated. 

Figure 6 shows the transmission of five packets, three of which
are lost and shown in grey (1, 3, and 5). The packets are of variable
size and the length of each is indicated by a horizontal arrow.
Each  bit  in  the  ACK  window  represents  50  bytes  with  a  1  if  the 
bytes  are  present  at  the  receiver  and  a  0  if  they  are  missing.  Once 
packet  #1  is  recovered,  the  receiver  would  generate  a  cumulative 
ACK  of  1449  and  the  bit  vector  would  indicate  positive  ACKs  for 
bytes  1600  through  1849.  There  is  some  ambiguity  for  packets  3 
and  4  since  the  ACK  window  shows  that  bytes  1550  -  1599  are 
missing.  The  sender  knows  that  this  range  includes  packets  3  and 
4  and  is  able  to  infer  that  packet  3  is  lost  and  packet  4  has  been 
received  correctly.  The  sender  maintains  the  information  returned 
in  the  ACK  Window,  flushing  it  only  when  the  window  advances. 
This  helps  to  prevent  the  unnecessary  retransmission  of  correctly 
received  packets  following  a  timeout  when  the  session  enters  slow 
start. 
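The sender-side inference in this example can be sketched as below. The packet boundaries are hypothetical (the text gives only the cumulative ACK and the block ranges), and the rule used here is an assumption of the sketch: since packets arrive whole, a packet is treated as received if any block overlapping it is marked present, and as missing if every overlapping block is marked absent.

```python
def classify_packets(packets, cum_ack, bits, granularity):
    """Classify outstanding packets from an ACK window bit vector.

    packets: dict mapping a packet label to (first_byte, last_byte), inclusive.
    Returns a dict mapping each label to "received", "missing" or "unknown".
    """
    covered_to = cum_ack + len(bits) * granularity  # last byte the vector describes
    status = {}
    for name, (first, last) in packets.items():
        if last <= cum_ack:
            status[name] = "received"   # already covered by the cumulative ACK
        elif first > covered_to:
            status[name] = "unknown"    # the vector says nothing about this packet
        else:
            lo = max(0, (first - cum_ack - 1) // granularity)
            hi = min(len(bits) - 1, (last - cum_ack - 1) // granularity)
            seen = any(bits[i] for i in range(lo, hi + 1))
            status[name] = "received" if seen else "missing"
    return status


# Hypothetical layout consistent with the example: cumulative ACK 1449,
# 50-byte blocks, bytes 1600-1849 acknowledged, everything else missing.
example_packets = {3: (1450, 1574), 4: (1575, 1849), 5: (1850, 1999)}
example_bits = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
example_status = classify_packets(example_packets, 1449, example_bits, 50)
```

With these assumed boundaries the block covering bytes 1550-1599 overlaps both packets 3 and 4, yet packet 4 is still classified as received because the blocks that lie wholly inside it are set.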

Figure 6. ACK window transmitted from receiver to
sender. Packets 1, 3 and 5 are lost.


1  TCP-SACK  is  generally  limited  by  the  TCP  options  field  to  reporting  only  three 
unique  segments  of  continuous  data  within  a  window. 


3.2.3  Retransmission  Policy 

Our retransmission strategy is motivated by such evidence as the
Internet trace reports by Lin and Kung, which show that 85% of
TCP timeouts are due to "non-trigger" [12]. Non-trigger occurs
when a packet is retransmitted by the sender without previous attempts,
i.e., when three duplicate ACKs fail to arrive at the sender
and therefore TCP's fast retransmission mechanism is never triggered.
In this case, no retransmissions can occur until there is a timeout at
the source. Therefore, a mechanism is needed to quickly recover losses
without necessarily waiting for three duplicate ACKs from the receiver.

Given that TCP Santa Cruz has a much tighter estimate of the
RTT per packet and that the TCP Santa Cruz sender receives
precise information on each packet correctly received (via the ACK
Window), TCP Santa Cruz can determine when a packet has been
dropped without waiting for TCP's Fast Retransmit algorithm.
TCP Santa Cruz can quickly retransmit and recover a lost packet
once any ACK for a subsequently transmitted packet is received
and a time constraint is met. Any lost packet y, initially transmitted
at time t_i, is marked as a hole in the ACK window. Packet y can
be retransmitted once the following constraint is met: an
ACK arrives for any packet transmitted at time t_x (where t_x > t_i),
and t_current - t_i > RTT, where t_current is the current time and
RTT is the estimated round-trip time of the connection. Therefore,
any packet marked as unreceived in the ACK window can be
a candidate for early retransmission.
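The retransmission constraint can be written as a small predicate. The sketch below is illustrative only: the function and parameter names are hypothetical, and times are assumed to be in seconds on a single sender clock.

```python
def eligible_for_early_retransmit(t_i, t_x, t_current, rtt):
    """True when a hole first sent at t_i may be retransmitted early.

    t_x is the send time of a packet whose ACK has just arrived; the hole
    qualifies only if that packet was sent later (t_x > t_i) and more than
    one estimated RTT has elapsed since the hole's first transmission.
    """
    return t_x > t_i and (t_current - t_i) > rtt


def holes_to_retransmit(holes, t_x, t_current, rtt):
    """Select eligible holes from a dict {sequence_number: first_send_time}."""
    return sorted(
        seq for seq, t_i in holes.items()
        if eligible_for_early_retransmit(t_i, t_x, t_current, rtt)
    )
```

For example, with an RTT estimate of 0.1 s, a hole sent at time 0.0 becomes eligible when an ACK arrives at time 0.12 for a packet sent at 0.05, whereas a hole sent at 0.08 does not qualify yet.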

3.3  Proposed  Implementation 

TCP  Santa  Cruz  can  be  implemented  as  a  TCP  option  containing 
the  fields  depicted  in  Table  1.  The  TCP  Santa  Cruz  option  can 
vary  in  size  from  11  to  40  bytes,  depending  on  the  size  of  the  ACK 
window  (see  Section  3.2.2). 


Field                    Size (bytes)   Description
Kind                     1              kind of protocol
Length                   1              length field
Data.copy                4 (bits)       retrans. number of data pkt
ACK.copy                 4 (bits)       retrans. number of data pkt gen. ACK
ACK.sn                   4              SN of data pkt generating ACK
Timestamp                4              arr. time of data pkt generating ACK
ACK Window Granularity   1              num. bytes represented by each bit
ACK Window               0-18           holes in the receive stream

Table 1. TCP Santa Cruz options field description
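One possible byte layout for these fields is sketched below using Python's `struct` module. Several details are assumptions rather than facts from the text: the option kind value, the placement of Data.copy in the high nibble, the Length field counting the whole option, and the omission of the granularity byte when no ACK window is sent.

```python
import struct

SC_KIND = 253  # hypothetical kind value chosen for illustration only


def pack_sc_option(data_copy, ack_copy, ack_sn, timestamp,
                   granularity=None, ack_window=b""):
    """Pack the Table 1 fields into bytes in network byte order."""
    if not (0 <= data_copy < 16 and 0 <= ack_copy < 16):
        raise ValueError("copy counters are 4-bit fields")
    if len(ack_window) > 18:
        raise ValueError("ACK window is limited to 18 bytes")
    copies = (data_copy << 4) | ack_copy          # two 4-bit fields in one byte
    body = struct.pack("!BII", copies, ack_sn, timestamp)
    if ack_window:
        if granularity is None or not (0 <= granularity <= 255):
            raise ValueError("granularity must fit in one byte")
        body += struct.pack("!B", granularity) + ack_window
    length = 2 + len(body)                        # kind + length bytes + body
    return struct.pack("!BB", SC_KIND, length) + body


no_holes = pack_sc_option(0, 0, 1449, 123456)
with_holes = pack_sc_option(1, 0, 1449, 123456, granularity=50,
                            ack_window=bytes([0b00011111, 0b00000000]))
```

Under these assumptions the option without an ACK window occupies 11 bytes, which agrees with the lower bound on the option size stated in the text.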


4  Performance  Results 

In this section we examine the performance of TCP Santa Cruz
compared to TCP Vegas [3] and TCP Reno [19]. We first show
performance results for a basic configuration with a single source and a
bottleneck link, then a single source with cross-traffic on the reverse
path, and finally performance over asymmetric links. We have measured
performance for TCP Santa Cruz through simulations using
the "ns" network simulator [16]. The simulator contains
implementations of TCP Reno and TCP Vegas. TCP Santa Cruz was
implemented by modifying the existing TCP-Reno source code to
include the new congestion avoidance and error-recovery schemes.
Unless stated otherwise, data packets are of size 1Kbyte, the maximum
window size, cwnd_max, for every TCP connection is 64
packets and the initial ssthresh is equal to * cwnd_max. All
simulations are an FTP transfer with a source that always has data
to send; simulations are run for 10 seconds. In addition, the TCP
clock granularity is 100ms for all protocols.

4.1  Basic  Bottleneck  Configuration 

Our first experiment shows protocol performance over a simple network,
depicted in Figure 7, consisting of a TCP source sending
1Kbyte data packets to a receiver via two intermediate routers connected
by a 1.5Mbps bottleneck link. The bandwidth delay product
(BWDP) of this configuration is equal to 16.3Kbytes; therefore, in
order to accommodate one windowful of data, the routers are set to
hold 17 packets.


Figure 7. Basic bottleneck configuration (queue size Q = 17 packets at each router)

Figures 8 (a) and (b) show the growth of TCP Reno's congestion
window and the queue buildup at the bottleneck link. Once the
congestion window grows beyond 17 packets (the BWDP of the
connection) the bit pipe is full and the queue begins to fill. The
routers begin to drop packets once the queue is full; eventually
Reno notices the loss, retransmits, and cuts the congestion window
in half. This produces see-saw oscillations in both the window
size and the bottleneck queue length. These oscillations greatly increase
not only delay, but also delay variance for the application. It
is increasingly important for real-time and interactive applications
to keep delay and delay variance to a minimum.
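The see-saw behaviour can be reproduced with a very small toy model. This sketch is not the ns simulation: it assumes one window of packets per round trip, additive increase of one packet per round, a pipe that holds `bwdp` packets with the excess waiting in the router queue, and a halving of the window as soon as the excess would overflow the queue. Slow start, timeouts and retransmission delays are ignored.

```python
def reno_sawtooth(bwdp=17, queue_limit=17, rounds=60, cwnd=1):
    """Return a list of (cwnd, queue_length) pairs, one per round trip."""
    trace = []
    for _ in range(rounds):
        queue = max(0, cwnd - bwdp)          # packets that do not fit in the pipe
        if queue > queue_limit:              # router buffer overflows: a loss
            cwnd = max(1, cwnd // 2)         # multiplicative decrease
            queue = max(0, cwnd - bwdp)
        trace.append((cwnd, queue))
        cwnd += 1                            # additive increase per round trip
    return trace


trace = reno_sawtooth()
peak_window = max(c for c, _ in trace)
peak_queue = max(q for _, q in trace)
```

In this model the window climbs until the pipe and the queue are both full, drops to half, and repeats, so the queue length swings between empty and full in step with the window.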

In contrast, Figures 9 (a) and (b) show the evolution of the
sender's congestion window and the queue buildup at the bottleneck
for TCP Santa Cruz. These figures demonstrate the main strength
of TCP Santa Cruz: adaptation of the congestion control algorithm
to transmit at the bandwidth of the connection without congesting
the network and without overflowing the bottleneck queues. In this
example the threshold value Nop, the desired additional number
of packets in the network beyond the BWDP, is set to Nop = 1.5.
Figure 9(b) shows that the queue length at the bottleneck link for TCP
Santa Cruz reaches a steady-state value between 1 and 2 packets.
We also see that the congestion window, depicted in Figure 9(a),
reaches a peak value of 18 packets, which is the sum of the BWDP
(16.5) and Nop. The algorithm maintains this steady-state value for
the duration of the connection.

Table 2 compares the throughput, average delay and delay variance
for Reno, Vegas and Santa Cruz. For TCP Santa Cruz we vary
the amount of queueing tolerated in the network from Nop = 1 to 5
packets. All protocols achieve similar throughput, with Santa Cruz
(n = 5) performing slightly better than Reno. The reason Reno's
throughput does not suffer is that most of the time the congestion
window is well above the BWDP of the network so that packets
are always queued up at the bottleneck and therefore available for
transmission. What does suffer, however, is the delay experienced
by packets transmitted through the network.

Figure 8. TCP Reno: (a) congestion window (b) bottleneck queue

Figure 9. TCP Santa Cruz: (a) congestion window (b) bottleneck queue

The minimum forward delay through the network is equal to
40 msec propagation delay plus 6.9 msec packet forwarding time,
yielding a total minimum forward delay of approximately 47 msec.
Reno is the clear loser in this case with not only the highest average
delay, but also a high delay variation. Santa Cruz with Nop =
1.5 provides the same average delay as Vegas, but with a lower
delay deviation. As Nop increases, the delay in Santa Cruz also
increases because more packets are allowed to sit in the bottleneck
queue. Also, throughput is seen to grow with Nop because not only
does the slow-start period last longer, but the peak window size
is reached earlier in the connection, leading to a faster transmission
rate earlier in the transfer. In addition, with a larger Nop it is more
likely that a packet is available in the queue awaiting transmission.


Comparison of Throughput and Average Delay

Protocol      Throughput   Utilization   Ave. delay   Delay variance
              (Mbps)                     (msec)       (msec)
Reno          1.45         T37           5071         2.06
SC n=1.5      1.42                       63.1         - oiron
SC n=3        1.45         TI37          60.6         0.0063
SC n=5        1.47         0778          79.2         0.0073
Vegas (1,3)   pro          0771          55.2         0.0077

Table 2. Throughput, delay and delay variance comparisons
for Reno, Vegas and Santa Cruz for basic bottleneck
configuration.


4.2  Traffic  on  Reverse  Link 

This  section  looks  at  how  data  flow  on  the  forward  path  is  affected 
by  traffic  and  congestion  on  the  reverse  path.  Figure  10  shows  that, 
in  addition  to  a  TCP  source  from  A  to  the  Receiver,  there  is  also  a 
TCP  Reno  source  from  B  to  Router  1  in  order  to  cause  congestion 
on  the  reverse  link. 

Figure 10. Traffic on the reverse link

Figure 11 (a) shows that the congestion window growth for
Reno source A is considerably slower compared to the case when
no reverse traffic is present in Figure 8(a). Because Reno grows
its window based on ACK counting, lost and delayed ACKs on the
reverse path prevent Source A from filling the bit pipe as fast as it
could normally do, resulting in low link utilization on the forward
path. In addition, ACK losses also delay the detection of packet
losses at the source, which is waiting for three duplicate ACKs to
perform a retransmission. In contrast, Figure 11 (b) shows that the
congestion window for TCP Santa Cruz (Nop = 5) is relatively
unaffected by the Reno traffic on the reverse path and reaches the
optimal window size of around 22 packets, demonstrating TCP Santa
Cruz's ability to maintain a full data pipe along the forward path in
the presence of congestion on the reverse path.

Table 3 shows the throughput and delay obtained for Reno,
Santa Cruz and Vegas. Santa Cruz achieves up to a 68% improvement
in throughput compared to Reno and a 78% improvement over
Vegas. Because of the nearly constant window size, the variation
in delay with our algorithm is considerably lower than that of Reno. Vegas
suffers from low throughput in this case because its algorithm is unable
to maintain a good throughput estimate because of high variation
in RTT measurements. Vegas exhibits low delay primarily due
to its low utilization of the bottleneck link; this ensures that packets
are never queued at the bottleneck and therefore do not incur any
additional queueing delay from source to destination.

Figure 11. Comparison of congestion window growth when TCP-Reno traffic is present on the reverse path: (a) Reno (b) TCP
Santa Cruz n=5

Comparison of Throughput and Average Delay

Protocol      Throughput   Utilization   Ave. delay   Delay variance
              (Mbps)                     (msec)       (msec)
Reno          r 0323       0.55          TOO          56.2
SC n=1.5      ran          on            54.3         0.0034
SC n=3        OT2          on            60.6         0.0057
SC n=5        090          0 3J2         73.4         0.0080
Vegas (1,3)   OT78         032           40.3         0.0016

Table 3. Throughput, delay and delay variance comparisons
with traffic on the reverse link.


4.3 Asymmetric Links

In this section we investigate performance over networks that exhibit
asymmetry, e.g., ADSL, HFC or combination networks, which
may have a high bandwidth cable downstream link and a slower
telephone upstream link. TCP has been shown to perform poorly
over asymmetric links [11] primarily because of ACK loss, which
causes burstiness at the source (the size of the bursts is proportional
to the degree of asymmetry) and leads to buffer overflow
along the higher bandwidth forward path, and also because of reduced
throughput caused by slow window growth at the source due to lost ACK
packets. Lakshman et al. [11] define the normalized asymmetry
k of a path as the ratio of the transmission capacity of data packets
on the forward path to ACK packets on the reverse path. This is
an important measurement because it means the source puts out k
times as many data packets as the reverse link has capacity to carry ACKs for. Once
the queues in the reverse path fill, only one ACK out of k will make
it back to the source. Each ACK that does arrive at the source
then generates a burst of k packets in the forward path. In addition,
during congestion avoidance, the window growth will be slowed by
1/k as compared to a symmetric connection.
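The definition of k can be turned into a short calculation. In the sketch below the link rates and the ACK size are illustrative values chosen so that k works out to 3; they are assumptions of this example and are not taken from the text.

```python
def normalized_asymmetry(fwd_bps, data_bytes, rev_bps, ack_bytes):
    """Ratio of forward data-packet capacity to reverse ACK-packet capacity."""
    data_pkts_per_sec = fwd_bps / (8 * data_bytes)
    acks_per_sec = rev_bps / (8 * ack_bytes)
    return data_pkts_per_sec / acks_per_sec


# Assumed values: 24 Mbps forward link, 1000-byte data packets,
# 320 kbps reverse link, 40-byte ACKs.
k = normalized_asymmetry(24e6, 1000, 320e3, 40)

# Consequences described in the text once the reverse queue is full:
acks_surviving_fraction = 1 / k      # roughly one ACK in k reaches the source
burst_per_ack = k                    # each surviving ACK releases about k packets
```

With these numbers the forward link can carry 3000 data packets per second while the reverse link can carry 1000 ACKs per second, so roughly one ACK in three survives and each one releases a burst of about three data packets.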


Figure  12.  Simulation  configuration  for  asymmetric  links 


The simulation configuration depicted in Figure 12 has been
studied by Lakshman et al. in detail and is used here to examine
performance. In this configuration the forward buffer is B_f =
9 packets. Using 1 Kbyte data packets this results in a normalized
asymmetry factor k = 3.

Figure 13 (a) shows the congestion window growth for Reno.
Because of the burstiness of the connection due to ACK loss, there
are several lost data packets per window of data, causing Reno to
suffer timeouts every cycle (that is why the congestion window
reduces to 1 packet). Figure 13 (b) shows the development of the
window with TCP Santa Cruz. In this case, the congestion window
settles a few packets above the BWDP (equal to 31 packets) of the
connection.2 During slow start there is an initial overshoot
of the window size during one round-trip time delay, i.e., in the final
round before the algorithm picks up the growing queue, a burst of
packets is sent, which ultimately overflows the buffer.

A comparison of the overall throughput and delay obtained by
Reno, Vegas and Santa Cruz (Nop = 1.5, Nop = 3 and Nop = 5)
sources is shown below in Table 4. This table shows that Reno and
Vegas are unable to achieve link utilization above 52%. Because
of the burstiness of the data traffic, Santa Cruz needs an operating
point of at least Nop = 3 in order to achieve high throughput. For
Nop = 3 and Nop = 5 Santa Cruz is able to achieve 99% link
utilization. The end-to-end delays for Reno are around twice that
of Santa Cruz and the delay variance is seven orders of magnitude
greater than Santa Cruz. Because Vegas has such low link utilization
the queues are generally empty, thus there is a very low delay
and no appreciable delay variance.


Comparison of Throughput and Average Delay

Protocol      Throughput   Utilization   Ave. delay   Delay variance
              (Mbps)                     (msec)       (μsec)
Reno          053          0.52          8.4          1400
SC n=1.5      T275         0.53          3.5          0.0004
SC n=3        2X72         0.99          4.6          075005
SC n=5        23.73        0.99          4.8          0.0003
Vegas (1,3)   o ~m         0.33          3.3          0.0000

Table 4. Throughput, delay and delay variance over
asymmetric links.


Figure 13. Comparison of congestion window growth: (a) Reno (b) TCP Santa Cruz

2 See Lakshman et al. [11] for a detailed analysis of the calculation of the BWDP.


5 Conclusion

We have presented TCP Santa Cruz, which implements a new approach
to end-to-end congestion control and reliability and can
be implemented as a TCP option. TCP Santa Cruz makes use of a
simple timestamp returned from the receiver to estimate the level
of queueing in the bottleneck link of a connection. The protocol
successfully isolates the forward throughput of the connection from
events on the reverse link by considering the changes in delay along
the forward link only. We successfully decouple the growth of the
congestion window from the number of returned ACKs (the approach
taken by TCP), which makes the protocol resilient to ACK
loss. The protocol provides quick and efficient error-recovery by
identifying losses via an ACK window without waiting for three
duplicate ACKs. An RTT estimate for every packet transmitted (including
retransmissions) allows the protocol to recover from lost
retransmissions without using timer-backoff strategies, eliminating
the need for Karn's algorithm.

Simulation results show that TCP Santa Cruz provides high
throughput and low end-to-end delay and delay variance over networks
with a simple bottleneck link, networks with congestion in
the reverse path of the connection, and networks which exhibit path
asymmetry. We have shown that TCP Santa Cruz eliminates the
oscillations in the congestion window, but still maintains high link
utilization. As a result, it provides much lower delays than current
TCP implementations. For the simple bottleneck configuration our
protocol provides a 20% - 45% improvement in end-to-end delay
(depending on the value of Nop) and a delay variance three orders
of magnitude lower than Reno. For experiments with congestion
on the reverse path, TCP Santa Cruz provides an improvement in
throughput of at least 47% - 67% over both Reno and Vegas, as
well as an improvement in end-to-end delay of 45% - 59% over
Reno with a reduction in delay variance of three orders of magnitude.
When we examine networks with path asymmetry, Reno and
Vegas achieve link utilization of only 52% and 33%, respectively,
whereas Santa Cruz achieves 99% utilization. End-to-end delays
for this configuration are also reduced by 42% - 58% over Reno.

Our simulation experiments indicate that our end-to-end approach
to congestion control and error recovery is very promising,
and our current work focuses on evaluating the fairness of TCP
Santa Cruz, its coexistence with other TCP implementations, and
its performance over wireless networks.